METHOD AND SYSTEM FOR
LUNG DISEASE DETECTION 

Background of Invention 

[0001] This invention relates to a method and system for processing medical image data 
to aid in the detection and diagnosis of disease, and more particularly, to a method 
and system for detecting lung disease in medical images obtained from an x-ray
computed tomography (CT) system. 

[0002] An x-ray chest radiograph system is the more commonly used diagnostic tool
for detecting lung disease in humans. Lung diseases such as bronchitis, emphysema
and lung cancer are also detectable in chest radiographs and CT. However, CT systems
generally provide over 80 separate images for a single CT scan, thereby providing a
considerable amount of information to a radiologist for use in interpreting the images
and detecting suspect regions that may indicate disease.

[0003] Suspect regions are defined as those regions a trained radiologist would
recommend following through subsequent diagnostic imaging, biopsy, functional lung
testing, or other methods. The considerable volume of data presented by a single CT 
scan presents a time-consuming process for radiologists. Conventional lung cancer 
screening generally involves a manual interpretation of the 80 or more images by the 
radiologist. Fatigue is therefore a significant factor affecting sensitivity and specificity 
of the human reading. In other diseases, such as emphysema, it is difficult for a 
radiologist to classify the extent of disease progression by only looking at the CT 
images. Quantitative analysis of the anatomy is required. 

[0004] Attempts to automate lung cancer and emphysema detection in CT scans have 
been based on a variety of nodule detection and classification techniques, and lung 
parenchyma metrics. The emerging field is referred to as Computer Aided Diagnosis, 
or alternatively, Computer Aided Detection (CAD). There is a significant amount of
literature on methods for automating lung cancer detection in CT scans. Generally,
nodule detection has proceeded in three steps: lung segmentation, vessel extraction, 
and final nodule candidate detection and classification. 

[0005] Vessel extraction has been attempted using gray-level thresholding, fuzzy
clustering, and three-dimensional seeded region growing. Nodule detection has
been done using template matching, genetic algorithms, gray-level thresholding, the 
N-Quoit filter, region growing, and edge-gradient techniques. 

[0006] Once candidate nodules are produced by any of the above methods, classification
has been implemented via rule-based methods, neural network classification, fuzzy
logic, and statistical techniques including factor analysis and linear discriminant
analysis.

[0007] The above techniques presented to date, however, have largely focused on
identifying suspicious lesions in CT scans and have not directly addressed obtaining
correct differentiation of structures in the lung and correct measurements of their
size. Additionally, the above techniques are generally limited in the interpretative
nature of the results. Typically, identification and classification of a lesion using the
above techniques may produce a positive affirmation of a nodule, but further
radiologist qualitative review and interpretation of results is generally required. For
example, radiologists rely heavily on their familiarity with or expert knowledge of 
pathological and anatomical characteristics of various abnormal and normal structures 
in interpreting medical images. Further, the characteristics of the scanning device, 
such as type, pixel intensity and signal impulse response, also influence the 
presentation of the image data. A radiologist's interpretation of medical images also
generally relies on his or her familiarity with a given scanner. There has been no 
apparent evaluation by the above techniques to address the type of or characteristics 
of the scanning device in the analysis of the images produced. 

[0008] What is needed is a robust method and system for processing image data to
produce quantitative data to be used in detecting disease. What is further needed is a
method and system that provides interpretative results based on expert knowledge of
a disease as well as the scanner capabilities and characteristics. Additionally, there is
a requirement for the ability to track a disease's progression/regression resulting
from drug therapy. 

Summary of Invention 

[0009] In a first aspect, a method for processing medical images for use in the detection 
and diagnosis of disease is provided. The method comprises classifying regions of 
interest within the medical images based on a hierarchy of anatomical models and 
signal models of signal information of an image acquisition device used to acquire the 
medical images. The anatomical models are derived to be representative of anatomical 
information indicative of a given disease. 

[0010] In a second aspect, a computer-aided system for use in the diagnosis and
detection of disease is provided. The system comprises an image acquisition device
for acquiring a plurality of image data sets and a processor adapted to process the 
image data sets. The processor is adapted to classify selected tissue types within the 
image data sets based on a hierarchy of signal and anatomical models and the 
processor is further adapted to differentiate anatomical context of the classified tissue 
types for use in the diagnosis and detection of disease. 

Brief Description of Drawings 

[0011] The features and advantages of the present invention will become apparent from
the following detailed description of the invention when read with the accompanying
drawings in which:

[0012] Figure 1 is a block diagram illustration of a medical imaging system for which
embodiments of the present invention are applicable;

[0013] Figure 2 is a flow diagram of a method for processing image data for use in
detecting disease in accordance with embodiments of the present invention;

[0014] Figure 3 is a flow diagram of a segmentation method useful in the medical
imaging system of Figure 1; and,

[0015] Figure 4 is a block diagram illustration of a modeling method for use in detecting
disease in accordance with embodiments of the present invention.

Detailed Description

[0016] Referring to Figure 1, a general block diagram of a system 100 for disease
detection is shown. System 100 includes an imaging device 110, which can be
selected from a number of medical imaging devices known in the art for generating a
plurality of images. Most commonly, computed tomography (CT) and magnetic
resonance imaging (MRI) systems are used to generate a plurality of medical images.

[0017] During a CT imaging session, a patient lies horizontal and is exposed to a plurality
of x-rays measured with a series of x-ray detectors. A beam of x-rays passes through
a particular thin cross-section or "slice" of the patient. The detectors measure the
amount of transmitted radiation. This information is used to compute the x-ray
attenuation coefficient for sample points in the body. A gray scale image is then
constructed based upon the calculated x-ray attenuation coefficients. The shades of 
gray in the image contrast the amount of x-ray absorption of every point within the 
slice. The slices obtained during a CT session can be reconstructed to provide an 
anatomically correct representation of the area of interest within the body that has 
been exposed to the x-rays. 

[0018] During an MR imaging session, the patient is placed inside a strong magnetic field
generated by a large magnet. Magnetized protons within the patient, such as 
hydrogen atoms, align with the magnetic field produced by the magnet. A particular 
slice of the patient is exposed to radio waves that create an oscillating magnetic field 
perpendicular to the main magnetic field. The slices can be taken in any plane chosen 
by the physician or technician (hereinafter the "operator") performing the imaging 
session. The protons in the patient's body first absorb the radio waves and then emit 
the waves by moving out of alignment with the field. As the protons return to their 
original state (before excitation), diagnostic images based upon the waves emitted by 
the patient's body are created. Like CT image slices, MR image slices can be 
reconstructed to provide an overall picture of the body area of interest. Parts of the 
body that produce a high signal are displayed as white in an MR image, while those 
with the lowest signals are displayed as black. Other body parts that have varying 
signal intensities between high and low are displayed as some shade of gray. 

APP ID=09683111

[0019] Once initial MR or CT images have been obtained, the images are generally
segmented. The segmentation process classifies the pixels or voxels of an image into
a certain number of classes that are homogeneous with respect to some characteristic
(e.g. intensity, texture, etc.). For example, in a segmented image of the brain, the
material of the brain can be categorized into three classes: gray matter, white matter, 
and cerebrospinal fluid. Individual colors can be used to mark regions of each class 
after the segmentation has been completed. Once the segmented image is developed, 
surgeons can use the segmented images to plan surgical techniques. 

[0020] Generally, creating a segmented CT or MR image involves several steps. A data set 
is created by capturing CT or MR slices of data. Through the segmentation process, a 
gray scale value is then assigned to each point in the data set and different types of 
tissues will have different gray scale values. Each type of material in the data is 
assigned a specific value and, therefore, each occurrence of that material has the 
same gray scale value. For example, all occurrences of bone in a particular image may 
appear in a particular shade of light gray. This standard of coloring allows the 
individual viewing the image to easily understand the objects being represented in the 
images. 

[0021] FIG. 1 illustrates a medical imaging system 100 to which embodiments of the
invention are applicable. The system includes an imaging device 110, a processor 120
and an interface unit 130. Imaging device 110 is adapted to generate a plurality of
image data sets 240 and is, for example, a computed tomography (CT) or magnetic
resonance (MR) scanner. In the context of CT or MR, acquisition of image data is
generally referred to as "scans". Processor 120 is configured to perform computations
in accordance with embodiments of the present invention which will be described in
greater detail with reference to Figures 2-4. Processor 120 is also configured to
perform computation and control functions for well-known image processing
techniques such as reconstruction, image data memory storage, segmentation and the
like. Processor 120 may comprise a central processing unit (CPU) such as a single
integrated circuit, such as a microprocessor, or may comprise any suitable number of
integrated circuit devices and/or circuit boards working in cooperation to accomplish
the functions of a central processing unit. Processor 120 desirably includes memory.
Memory within processor 120 may comprise any type of memory known to those
skilled in the art. This includes Dynamic Random Access Memory (DRAM), Static RAM
(SRAM), flash memory, cache memory, etc. While not explicitly shown in FIG. 1, the
memory may be a single type of memory component or may be composed of many
different types of memory components. Processor 120 is also capable of executing
the programs contained in memory and acting in response to those programs or other 
activities that may occur in the course of image acquisition and image viewing. As 
used herein, "adapted to", "configured" and the like refer to mechanical or structural 
connections between elements to allow the elements to cooperate to provide a 
described effect; these terms also refer to operation capabilities of electrical elements 
such as analog or digital computers or application specific devices (such as an 
application specific integrated circuit (ASIC)) that are programmed to perform a sequence of operations
to provide an output in response to given input signals. 

[0022] Interface unit 130 is coupled to processor 120 and is adapted to allow human
users to communicate with system 100. Processor 120 is further adapted to perform
computations that are transmitted to interface unit 130 in a coherent manner such
that a human user is capable of interpreting the transmitted information. Transmitted
information may include images in 2D or 3D, color and gray scale images, and text
messages regarding diagnosis and detection information. Interface unit 130 may be a
personal computer, an image work station, a hand held image display unit or any
conventional image display platform generally grouped as part of a CT or MRI system.

[0023] All data gathered from multiple scans of the patient is to be considered one data 
set. Each data set can be broken up into smaller units, either pixels or voxels. When 
the data set is two-dimensional, the image is made up of units called pixels. A pixel is 
a point in two-dimensional space that can be referenced using two dimensional 
coordinates, usually x and y. Each pixel in an image is surrounded by eight other 
pixels, the nine pixels forming a three-by-three square. These eight other pixels, 
which surround the center pixel, are considered the eight-connected neighbors of the 
center pixel. When the data set is three-dimensional, the image is displayed in units 
called voxels. A voxel is a point in three-dimensional space that can be referenced 
using three-dimensional coordinates, usually x, y and z. Each voxel is surrounded by 
twenty-six other voxels. These twenty-six voxels can be considered the twenty-six 
connected neighbors of the original voxel. 
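
The neighbor definitions above can be illustrated with a minimal sketch. The function names `neighbor_offsets` and `neighbors` are hypothetical and are not part of the described system. Note that the counts of eight and twenty-six hold for interior points only; a pixel or voxel on the border of the data set has fewer in-bounds neighbors.

```python
from itertools import product


def neighbor_offsets(ndim):
    """Return the offsets to all fully connected neighbors of a grid point.

    ndim=2 yields the 8-connected neighbors of a pixel and ndim=3 yields
    the 26-connected neighbors of a voxel.
    """
    # Every combination of -1, 0, +1 per axis, except the all-zero offset
    # (which would be the point itself).
    return [off for off in product((-1, 0, 1), repeat=ndim) if any(off)]


def neighbors(point, shape):
    """List the in-bounds neighbors of `point` in a grid of the given shape."""
    result = []
    for off in neighbor_offsets(len(shape)):
        cand = tuple(p + o for p, o in zip(point, off))
        if all(0 <= c < s for c, s in zip(cand, shape)):
            result.append(cand)
    return result


pixel_neighbors = neighbors((2, 2), (5, 5))        # interior pixel
voxel_neighbors = neighbors((1, 1, 1), (3, 3, 3))  # interior voxel
corner_neighbors = neighbors((0, 0), (5, 5))       # corner pixel
```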

[0024] In an embodiment of the present invention, a computer-aided system for use in
the diagnosis and detection of disease comprises an image acquisition device for
acquiring a plurality of image data sets and a processor adapted to classify selected
tissue types within the image data sets based on a hierarchy of signal and anatomical
models. The processor is further adapted to differentiate anatomical context of the
classified tissue types for use in the diagnosis and detection of a selected disease. The
system further comprises an interface unit for presenting the classified tissue types
within the image data sets and anatomical context of the classified tissue types for
aiding an interpretation of the processed image data sets. The anatomical models are
parametric, mathematical representations of anatomical tissues. The anatomical
context comprises at least one of lung nodules indicative of lung cancer, healthy lung
tissue, diseased lung tissue indicative of chronic obstructive pulmonary disease
(COPD) and other pathological descriptions of tissue that can be characterized by
radiologists and further modeled mathematically. Further discussion of anatomical
context and mathematical modeling will be provided with reference to Figure 4.

[0025] In an exemplary embodiment, the imaging device is an x-ray CT scanner. A CT
system is particularly well adapted to acquire a plurality of images, or alternatively
slices, of a region of interest. Also, in this exemplary embodiment, the imaging object
is a lung. It is to be appreciated that other imaging devices that provide a plurality of
images, such as magnetic resonance (MR), would also benefit from embodiments of
the present invention. Also, it is to be appreciated that regions of interest other
than the lung may be the imaging object, e.g. the heart, colon, limbs, breast or brain.
The processing functions performed by processor 120 would be adapted to classify
tissue types of interest in these other imaging objects.

[0026] An embodiment for a method for detecting disease from the plurality of medical
images comprises the steps of acquiring the image data; processing the acquired
image data to define the lung region; computing low level features in the image using
the known characteristics of the imaging device and the imaging process; grouping 
regions in the image, based on their features and an information object hierarchy 
describing their features, into anatomical structures; and, deciding if any of the 
grouped regions represents an area which is suspicious for a lung disease. The 
method further comprises presenting the areas identified as suspicious for lung 
disease. The presenting step comprises presenting the anatomical context (e.g. lung
nodule, diseased tissue, healthy tissue) and a decision process by which the 
suspicious areas were identified. The grouping of regions is performed using 
comparisons of signal and anatomical models using Bayes Factors. In a further 
embodiment, a method for characterizing tissue in medical images for use in disease 
diagnosis and detection comprises computing an information object hierarchy of 
increasing complexity to characterize anatomical tissue. The object hierarchy contains 
models, or alternatively mathematical representations, based on characteristics of an 
image acquisition device used in acquiring the images and based on anatomical 
characteristics of a selected region of interest and a specified disease. The grouping, 
the object hierarchy and Bayes Factor comparisons will be described in greater detail 
in paragraphs that follow and with reference to Figure 4. 

[0027] Referring to Figure 2, there is shown a more detailed flow diagram of an
embodiment of a method for processing image data to be used in detecting disease.
Image data is acquired at 210. These images are passed to processor 120 (Figure 1)
for processing steps 220-280 of Figure 2. At step 220, the area of the images that
represents the lung is determined by selection of various known segmentation 
techniques or, alternatively by an exemplary embodiment of pleural space 
segmentation which will be discussed in greater detail below with reference to Figure 
3. Resulting from step 220, input pixels from a CT scan are first classified to be either 
in the lung cavity or outside the lung. The input pixels are acquired from either a two- 
dimensional CT scan data set or, alternatively, from a three-dimensional CT scan data 
set. At 230, processor 120 then computes low-level signal models from the gray scale
values of the image within the lung region. These models, for example, may include 
(but are not limited to) compact, bright objects; compact, dark objects; and long, 
bright objects. The low-level signal models are mathematical descriptions of 
structures being imaged after the measurement process of the scanner modifies them. 
Signal model processing continues at 250 to gain more information regarding a 
region of pixels in the image. In an embodiment of signal model processing for step 
250, different signal models are competed against each other in order to best explain 
a region of pixels in the images. The competition is desirably carried out by 
performing comparisons between the signal models using the known statistical-based 
process of Bayes Factors. It is to be appreciated that other decision or statistical based
methods may also be used. An exemplary embodiment using Bayes Factors will be 
described in greater detail below and with reference to Figure 4. 
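
The competition between signal models can be sketched as follows. This is an illustration under stated assumptions rather than the described embodiment: each model is assumed to be a Gaussian intensity model with fixed mean and standard deviation, so that the marginal likelihood in the Bayes Factor reduces to an ordinary likelihood. The model names follow the examples given above, while the intensity values and the function names are hypothetical.

```python
import math


def gaussian_log_likelihood(values, mean, sigma):
    """Log-likelihood of independent samples under a Gaussian model."""
    n = len(values)
    sq = sum((v - mean) ** 2 for v in values)
    return -n * math.log(sigma * math.sqrt(2.0 * math.pi)) - sq / (2.0 * sigma ** 2)


def log_bayes_factor(values, model_a, model_b):
    """Log Bayes Factor of model_a against model_b.

    Each model is a (mean, sigma) pair with fixed parameters (an assumption
    of this sketch), so no integration over parameters is needed.
    """
    return (gaussian_log_likelihood(values, *model_a)
            - gaussian_log_likelihood(values, *model_b))


def best_model(values, models):
    """Return the name of the model that wins every pairwise comparison."""
    names = list(models)
    best = names[0]
    for name in names[1:]:
        # A positive log Bayes Factor favors the challenger over the current best.
        if log_bayes_factor(values, models[name], models[best]) > 0.0:
            best = name
    return best


# Hypothetical gray scale values for one region of pixels.
region = [102.0, 98.0, 105.0, 99.0, 101.0]
models = {"compact_bright": (100.0, 5.0), "compact_dark": (20.0, 5.0)}
winner = best_model(region, models)
```

A log Bayes Factor above zero favors the first model and a value below zero favors the second. Models with free parameters would require integrating the likelihood over a prior, which this sketch omits.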

[0028] After decisions have been made regarding the best low-level signal model, a 
further grouping process occurs at steps 260 and 270. This involves grouping the 
low-level models into anatomical structures such as particular areas of the lung. 
Again, the decision process involves competing anatomical models desirably using 
Bayes Factors in order to make an optimal decision as to model applicability. 

[0029] Finally, at step 280, results are presented. Results are based on the information 
provided by the low-level signal models and the anatomical models in order to 
provide qualitative and quantitative information regarding suspicion for lung disease. 
Decisions at this level are made in the same way that a radiologist might make 
decisions regarding a lung nodule because the system has both low-level signal 
knowledge and anatomical context. 

[0030] Referring to Figure 3, an embodiment for identifying the lung region at step 220 is 
provided. In this embodiment, a lung segmentation process is provided that 
automatically identifies the boundaries of the pleural space in a Computed 
Tomography (CT) data set. The boundary is either a set of two-dimensional (2D) 
contours in a slice plane or a three-dimensional (3D) triangular surface that covers the 
entire volume of the pleural space. The extracted boundary can be subsequently used 
to restrict Computer Aided Detection (CAD) techniques to the pleural space. This will 
reduce the number of false positives that occur when a lung nodule detection 
technique is used outside the pleural space. 

[0031] Referring further to Figure 3, the 3D surface identification proceeds as follows: 

[0032] 310 Acquire a CT data set that covers the lung. The extent of the CT exam should 
cover the entire region of the pleural space. The centerline landmark of the exam 
should run approximately down the center of the thorax. 

[0033] 311 Read the CT data set into memory. For efficiency, the data set should reside in
contiguous memory, although other means of memory organization are possible.

[0034] 312 Select a threshold. Select an intensity value that corresponds approximately
to air in the CT study. This intensity is called the threshold. The threshold can be 
chosen using a variety of means, but only needs to be done once per CT lung 
protocol. The same threshold can be used for all exams using the same protocol (e.g. 
scanning procedure). 

[0035] 313 Segment the study into foreground and background regions. Replace all
samples that have values below the threshold with a positive constant foreground 
value. Replace all other samples with a 0, the background value. The actual 
foreground value is arbitrary. Samples marked with the foreground value will 
correspond to air while samples with a background value will correspond to other 
tissue. 

[0036] 314 Remove islands in the xy, xz and yz planes. Islands are groups of samples
that contain 0 but are surrounded by non-zero samples. Islands are removed by
setting their values to the foreground value. Only islands that are below a specified
island size are removed. The island size is chosen to be larger than the area of the
cross-section of a vessel or bronchial passage and smaller than the area of
background outside the CT circle of reconstruction.

[0037] 315 Select a seed in the pleural space. The seed is located in the middle slice, one
quarter of the distance from the left of the image and one half of the distance from
the bottom of the image.

[0038] 316 Extract a 3D connected region. Using the seed as a starting point, mark all
samples that are connected to the seed and have the same value as the seed. Other
selected connectivity algorithms are also suitable. An exemplary technique is
disclosed in US Patent 4,751,643 - METHOD AND APPARATUS FOR DETERMINING
CONNECTED SUBSTRUCTURES WITHIN A BODY.



[0039] 317 Extract surfaces. Extract a surface comprised of triangles using an isosurface
extraction technique. The isosurface corresponds to a value midway between the
foreground and background values. Any isosurface extraction technique can be used.
An exemplary isosurface extraction technique is the well-known Marching Cubes
algorithm as described in US Patent 4,710,876 - SYSTEM AND METHOD FOR THE
DISPLAY OF SURFACE STRUCTURES CONTAINED WITHIN THE INTERIOR REGION OF A
SOLID BODY.
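
Steps 313, 315 and 316 above can be sketched as follows. This is a minimal illustration, not the implementation of the cited patents. It assumes the volume is stored as nested lists indexed `volume[z][y][x]`, that 26-connectivity is used for region growing, and that the intensity values and threshold are hypothetical. Island removal (step 314) and surface extraction (step 317) are omitted.

```python
from collections import deque
from itertools import product


def segment(volume, threshold, foreground=1):
    """Step 313: samples below the threshold become foreground, all others 0."""
    return [[[foreground if v < threshold else 0 for v in row]
             for row in plane] for plane in volume]


def pick_seed(mask):
    """Step 315: middle slice, one quarter from the left, one half from the bottom."""
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    return (nz // 2, ny // 2, nx // 4)


def grow_region(mask, seed):
    """Step 316: collect all samples connected to the seed that share its value."""
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    offsets = [o for o in product((-1, 0, 1), repeat=3) if any(o)]
    target = mask[seed[0]][seed[1]][seed[2]]
    region = {seed}
    queue = deque([seed])
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in offsets:
            p = (z + dz, y + dy, x + dx)
            if (0 <= p[0] < nz and 0 <= p[1] < ny and 0 <= p[2] < nx
                    and p not in region and mask[p[0]][p[1]][p[2]] == target):
                region.add(p)
                queue.append(p)
    return region


# Hypothetical 3 x 4 x 8 volume: tissue at 40, air at -1000.
volume = [[[40] * 8 for _ in range(4)] for _ in range(3)]
for z in range(3):
    for y in (1, 2):
        for x in (1, 2, 3):
            volume[z][y][x] = -1000      # air pocket standing in for a pleural space
volume[0][1][6] = -1000                  # isolated air sample, not connected to the pocket

mask = segment(volume, threshold=-500)
seed = pick_seed(mask)
region = grow_region(mask, seed)
```

In practice it would be prudent to verify that the seed actually falls on a foreground sample before growing the region; the sketch assumes that it does.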

[0040] The 2D contour identification proceeds as follows. Steps 310-316 correspond to 
steps 310-316 for the 3D surface. 

[0041] 310 Acquire a CT data set that covers the lungs.

[0042] 311 Read the CT data set into memory.

[0043] 312 Select a threshold.

[0044] 313 Segment the study into foreground and background regions.

[0045] 314 Remove islands in the xy, xz and yz planes.

[0046] 315 Select a seed for the pleural space.

[0047] 316 Extract a 3D connected region.

[0048] 317 Extract a clipped portion of the volume data set that corresponds to the right
pleural space. The clipped region should extend beyond the centerline of the data by a
fixed percentage. This is to accommodate pleural cavities that may cross the centerline.
A 20% overlap seems appropriate for lung studies.

[0049] 318 Identify the contours in the right pleural space. Extract contours comprised of
line segments using a contour extraction technique. Any contour extraction technique
can be used. An exemplary embodiment is the Marching Squares algorithm, a
specialization of Marching Cubes described in US Patent 4,710,876 - SYSTEM AND
METHOD FOR THE DISPLAY OF SURFACE STRUCTURES.

[0050] 319 Sort the contours by line segment count and keep the contour with the
largest number of line segments. This contour corresponds to the right pleural space.

[0051] 320 Extract a clipped portion of the volume data set that corresponds to the left
pleural space. This is the same as step 317 except the clipping region is specified
from the right side of the images.



[0052] 321 Identify the contours of the left pleural space. This is the same as step 318,
applied to the region in step 320. This contour corresponds to the left pleural space.
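
The clipping and contour selection steps of the 2D procedure can be sketched as follows. The sketch makes several assumptions that the text does not state: the overlap is taken as a fraction of the half width of the image, the right pleural space is clipped from the left side of the image (inferred from step 320, which clips the left pleural space from the right side), and a contour is represented as a list of line segments. The function names are hypothetical, and the contour extraction itself (e.g. Marching Squares) is not shown.

```python
def clip_columns(width, from_left, overlap=0.20):
    """Column range [start, stop) of a clipped region for one pleural space.

    The region extends past the centerline by `overlap` times the half width,
    which is one possible reading of "a fixed percentage".
    """
    center = width // 2
    extra = int(round(overlap * center))
    if from_left:
        return (0, min(width, center + extra))
    return (max(0, center - extra), width)


def largest_contour(contours):
    """Keep the contour with the largest number of line segments."""
    return max(contours, key=len)


# Hypothetical 512-column image.
right_pleural_columns = clip_columns(512, from_left=True)
left_pleural_columns = clip_columns(512, from_left=False)

# Hypothetical contours, each a list of ((x0, y0), (x1, y1)) line segments.
contours = [
    [((0, 0), (1, 0))],
    [((5, 5), (6, 5)), ((6, 5), (6, 6)), ((6, 6), (5, 6)), ((5, 6), (5, 5))],
    [((9, 9), (9, 8)), ((9, 8), (8, 8))],
]
kept = largest_contour(contours)
```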

[0053] Employing the above-described embodiments of the segmentation process
enables an automatic selection of all algorithm parameters based on the specific
anatomy of the lung and the CT examination protocol. Further, the island removal is
performed in three consecutive passes, each in a different plane. It is to be
appreciated that identifying the lung region initially allows a reduction in computation 
time and complexity for the downstream measurements. 
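
The island removal described in step 314 and in the paragraph above can be sketched as follows. This is a minimal illustration under stated assumptions: the mask is stored as nested lists indexed `mask[z][y][x]`, islands are found as 4-connected components of zero-valued samples within each plane, and the size criterion alone decides whether a component is filled. The function names and the island size are hypothetical.

```python
from collections import deque


def remove_islands_2d(plane, max_size, foreground=1):
    """Fill zero-valued components smaller than max_size, in place."""
    rows, cols = len(plane), len(plane[0])
    seen = set()
    for r0 in range(rows):
        for c0 in range(cols):
            if plane[r0][c0] != 0 or (r0, c0) in seen:
                continue
            # Flood fill one component of zeros using 4-connectivity.
            comp, queue = [(r0, c0)], deque([(r0, c0)])
            seen.add((r0, c0))
            while queue:
                r, c = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    p = (r + dr, c + dc)
                    if (0 <= p[0] < rows and 0 <= p[1] < cols
                            and p not in seen and plane[p[0]][p[1]] == 0):
                        seen.add(p)
                        comp.append(p)
                        queue.append(p)
            if len(comp) < max_size:
                for r, c in comp:
                    plane[r][c] = foreground
    return plane


def remove_islands_3d(mask, max_size, foreground=1):
    """Run the 2D island removal in three passes: xy, xz, then yz planes."""
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    for z in range(nz):                                   # xy planes (fixed z)
        remove_islands_2d(mask[z], max_size, foreground)
    for y in range(ny):                                   # xz planes (fixed y)
        plane = [[mask[z][y][x] for x in range(nx)] for z in range(nz)]
        remove_islands_2d(plane, max_size, foreground)
        for z in range(nz):
            mask[z][y] = plane[z]
    for x in range(nx):                                   # yz planes (fixed x)
        plane = [[mask[z][y][x] for y in range(ny)] for z in range(nz)]
        remove_islands_2d(plane, max_size, foreground)
        for z in range(nz):
            for y in range(ny):
                mask[z][y][x] = plane[z][y]
    return mask
```

Choosing `max_size` between the cross-sectional area of a vessel and the area of the background outside the circle of reconstruction, as the text prescribes, keeps the large background region intact while filling vessels and bronchial passages.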

[0054] Referring to Figure 4, an embodiment of a hierarchy of models to be used in the 
method of Figure 2 and a method for processing within the hierarchy are provided. 
The hierarchy of models comprises models of various levels comprising signal model
data, geometric model data, and anatomical model data. Those pixels that have been
classified as being inside the lung region at step 220 (Figure 2) are modeled at several
levels of modeling structure, hereinafter referred to as the hierarchy. As used herein,
models refer generally to mathematical representations or, alternatively, mathematical 
translations. 

[0055] At a first or low level, characteristics of the imaging device are translated into 
mathematical representations. Characteristics of the imaging device that are of 
interest are those characteristics that generally affect the display and resolution of the 
images or otherwise affect a radiologist's interpretation of regions in the image. For 
example, the scanner point spread function is a measurable indicator of the image 
formation process and may be mathematically modeled. Other indicators of the image 
formation process include X-ray density, brightness, resolution and contrast. 

[0056] At a second or intermediate level, fitted shape models are derived to explain the 
geometry and intensity surface of various tissues. Shape and geometric model 
information is derived from anatomical information and expert radiologist 
observations which will be described in greater detail with respect to Figure 4. 

[0057] With one pass through the hierarchy, low-level pixel information (X-ray density) is
transformed into anatomical information. This anatomical information is a
classification of all pixels into lung tissue types, e.g. blood vessel, lung matrix, and
lung cancer nodule. The models for the intermediate level are generally derived from
pathological information for a region of interest (e.g. lung) and a specific disease (e.g.
lung cancer or COPD) obtained from expert information, for example radiologists that 
have observed recurring characteristics for certain types of lung disease. The expert 
information is desirably from a radiologist or a plurality of radiologists experienced 
with the region of interest and detection of the specific disease. 

[0058] The models described above create a hierarchy of information objects of
increasing radiological importance that are explicitly modeled. For example, as is
shown in Figure 4, lung nodules and vascular structure are indicators of lung disease
such as lung cancer. Additionally, lung parenchyma metrics are also indicators of lung
disease. Based on radiologists' observations, or alternatively, other disease experts, a
set of geometric and shape characteristics are obtained. For example, lung cancer
nodules are generally compact, bright and spherical in nature. Further, lung
nodules that are likely to be cancerous tend to be spiculated (spidery vessel
structures). In embodiments of the present invention, these characterizations of 
disease, such as lung cancer nodules, are mathematically represented as nodule 
model 470 and vessel model 480 as shown in Figure 4. Of further interest is lung 
matrix tissue which can be considered background to the vessels and nodules and for 
embodiments of the present invention is also modeled as the mathematical 
representation of lung matrix tissue model 490. 

[0059] Referring further to Figure 4, nodule model 470, vessel model 480 and lung
matrix tissue model 490 represent a high level explanation in the hierarchy used to
distinguish various lung tissues. Each of the high level models is further defined at
low and intermediate levels. For example, nodules are generally spherical and bright
(measurable in Hounsfield units). Thus, a shape model representing intensity region
formation 440 and a signal model for step edge detection 410 are derived 
mathematically to enable identification of a potential nodule. In a further example, 
spiculated nodules tend to have a compact core structure fed by one or more vessels 
and also having a spidery or spiculated structure. A shape model representing 
intensity ribbon formation 450 and a signal model representing fold edge detection 
420 similarly enable identification of a potential spiculated nodule. Background lung 
tissue is also similarly defined by low and intermediate levels of a texture region 
formation model 460 and a sub-resolution texture detection model 430. 
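
The three levels of the hierarchy of Figure 4 can be summarized as a simple data structure. The sketch below is illustrative only. The reference numerals are those used in the text, but the pairing of vessel model 480 with intensity ribbon formation 450 and fold edge detection 420 is an inference from the passages on spiculated nodules and vascular structure, and the names `HIERARCHY` and `path_to` are hypothetical.

```python
# High-level anatomical model -> its intermediate (shape) and low (signal) models.
HIERARCHY = {
    "nodule model (470)": {
        "intermediate": "intensity region formation (440)",
        "low": "step edge detection (410)",
    },
    "vessel model (480)": {
        "intermediate": "intensity ribbon formation (450)",
        "low": "fold edge detection (420)",
    },
    "lung matrix tissue model (490)": {
        "intermediate": "texture region formation (460)",
        "low": "sub-resolution texture detection (430)",
    },
}


def path_to(high_level_model):
    """Return the low -> intermediate -> high chain for one anatomical model."""
    levels = HIERARCHY[high_level_model]
    return [levels["low"], levels["intermediate"], high_level_model]


nodule_path = path_to("nodule model (470)")
```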

[0060] At each level of the hierarchy, decisions are made via estimation of model
parameters as to the characteristics of each information object, that is, each
increasingly high-level explanation of the pixels in the images. The modeling steps at
each level of the hierarchy are:

[0061] 1) Pixel information as the output of an image formation process; and, 

[0062] 2) Fitted shape models are derived to explain the geometry and intensity surface 
of various tissues. 

[0063] In the first modeling step (step edge detection 410), pixel information is analyzed 
at the output of an image formation process (240 of Figures 1 and 2). The tissue 
boundaries are identified using convolution operators. Nodule candidates are 
localized by convolving the images with differential kernels defined by the signal 
impulse response of the imaging device. In an embodiment, images acquired from a
GE LightSpeed Scanner were used and a Canny edge detector was used with a
smoothing parameter of 1.1 pixels. A description of the Canny edge-detection
algorithm can be found in Canny86.
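
The first modeling step can be illustrated in one dimension. The sketch below is not the Canny detector of the embodiment: it shows only the Gaussian-derivative convolution that such a detector starts from, applied to a single intensity profile, and it omits two-dimensional non-maximum suppression and hysteresis. The function names, the edge-replication boundary handling and the `min_strength` value are assumptions made for illustration; only the smoothing parameter of 1.1 pixels comes from the text.

```python
import math


def gaussian_derivative_kernel(sigma):
    """Sampled first derivative of a Gaussian with standard deviation sigma (pixels)."""
    radius = int(math.ceil(3 * sigma))
    kernel = []
    for k in range(-radius, radius + 1):
        g = math.exp(-(k * k) / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))
        kernel.append(-k / (sigma * sigma) * g)
    return kernel


def convolve_same(signal, kernel):
    """Discrete convolution with edge replication; output has the length of signal."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i - (j - radius), 0), n - 1)  # replicate border samples
            acc += w * signal[idx]
        out.append(acc)
    return out


def step_edge_locations(profile, sigma=1.1, min_strength=50.0):
    """Indices where the smoothed gradient magnitude peaks above min_strength."""
    grad = convolve_same(profile, gaussian_derivative_kernel(sigma))
    mag = [abs(g) for g in grad]
    edges = []
    for i in range(1, len(mag) - 1):
        if mag[i] >= min_strength and mag[i] > mag[i - 1] and mag[i] >= mag[i + 1]:
            edges.append(i)
    return edges


# Hypothetical profile: dark lung matrix followed by a bright structure.
profile = [400.0] * 10 + [900.0] * 10
edges = step_edge_locations(profile)
```

A larger smoothing parameter widens the kernel and suppresses more noise at the cost of localization, which is the trade-off the choice of 1.1 pixels settles for this scanner.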

[0064] The vascular structure is localized by convolving the images with differential
kernels defined by the signal impulse response of the imaging device using fold edge
detection 420. In this embodiment using images acquired from a GE LightSpeed
Scanner, a fold edge detector was used with a smoothing parameter of 1.5 pixels.

[0065] Background tissue is represented as sub-resolution texture by sub-resolution 
texture detection 430. Background tissue is localized by identifying regions of low 
intensity. Convolution kernels defined by the signal impulse response of the imaging 
device are used to identify potential background regions. In this embodiment using
images acquired from a GE LightSpeed Scanner, a Canny edge detector is used with a
smoothing parameter of 1.1 pixels. At this stage of processing, the list of background
regions is trimmed by thresholding at an average intensity of 520. An alternate
localization procedure consists of modeling the background tissue as generalized
intensity cylinders with random orientation. In this implementation, localization is
achieved by comparing the output of the generalized-cylinder model with the image
intensities.
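
The trimming of the background region list can be sketched as follows. The text gives the threshold value of 520 but not the direction of the comparison; because background tissue is described as low intensity, this sketch assumes that regions whose average intensity exceeds the threshold are discarded. The function name and the representation of a region as a list of pixel intensities are hypothetical.

```python
def trim_background_regions(regions, max_mean_intensity=520.0):
    """Keep only regions whose average intensity is at or below the threshold.

    Each region is a list of pixel intensities. Keeping the low-intensity
    regions is an assumption; the text states only the threshold of 520.
    """
    kept = []
    for pixels in regions:
        if pixels and sum(pixels) / len(pixels) <= max_mean_intensity:
            kept.append(pixels)
    return kept
```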

[0066] In the second modeling step, fitted shape models are used to explain the
geometry and intensity surface of the various tissues. Putative nodule candidates are
formed by grouping the output of the signal model stage into regions at intensity 
region formation step 440. Region grouping is performed by extrapolating edge 
segments perpendicular to the edge gradient. Edges ending near vertices associated 
with other edges are connected to form regions. In this implementation, the distance 
threshold for connecting edge segments is 4 pixels. 
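
A minimal sketch of the connection rule follows. It covers only the proximity test with the 4-pixel threshold and leaves out the extrapolation of segments perpendicular to the edge gradient. Representing a segment as a pair of endpoints and grouping with a union-find structure are assumptions made for illustration.

```python
import math


def link_segments(segments, max_gap=4.0):
    """Group edge segments whose endpoints lie within max_gap pixels of each other.

    segments: list of ((x0, y0), (x1, y1)) endpoint pairs.
    Returns a sorted list of groups, each a list of segment indices.
    """
    parent = list(range(len(segments)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a in range(len(segments)):
        for b in range(a + 1, len(segments)):
            close = any(math.dist(p, q) <= max_gap
                        for p in segments[a] for q in segments[b])
            if close:
                parent[find(a)] = find(b)

    groups = {}
    for i in range(len(segments)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Each returned group is a set of connected segments that would go on to form one candidate region.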

[0067] The vascular structure is obtained at intensity ribbon formation step 450 by
linking together the output of the fold edge detection 420. At each point on the chain,
a width of the chain is defined by locating the nearest step-edge on each side in a
direction perpendicular to the chain direction. Through this sweeping operation, a set
of intensity ribbons is defined. These ribbons are implicitly defined by the centerline
of the fold-edges and the width of the fold along its entire length. These ribbons are
considered "candidate vessels", that is, objects which may be defined as blood vessels
in the next level of the hierarchy.
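
The sweeping operation at a single chain point can be sketched as below. The boolean step-edge mask, the (row, column) convention, the one-pixel stepping and the `max_steps` search limit are all assumptions for illustration; the text specifies only that the nearest step-edge is located on each side, perpendicular to the chain direction.

```python
def ribbon_width(edge_mask, point, direction, max_steps=20):
    """Width of a ribbon at one fold-edge chain point.

    edge_mask: 2D list of booleans marking step-edge pixels.
    point: (row, col) on the fold-edge chain.
    direction: unit chain direction as (d_row, d_col).
    Returns the distance in pixels between the nearest step edges on either
    side, or None if an edge is not found on both sides.
    """
    rows, cols = len(edge_mask), len(edge_mask[0])
    r0, c0 = point
    d_row, d_col = direction
    normal = (-d_col, d_row)  # perpendicular to the chain direction
    reach = []
    for sign in (1, -1):
        hit = None
        for step in range(1, max_steps + 1):
            r = int(round(r0 + sign * step * normal[0]))
            c = int(round(c0 + sign * step * normal[1]))
            if not (0 <= r < rows and 0 <= c < cols):
                break
            if edge_mask[r][c]:
                hit = step
                break
        reach.append(hit)
    if None in reach:
        return None
    return reach[0] + reach[1]
```

Repeating this at every point along the chain gives the width profile that, together with the centerline, defines one intensity ribbon.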

[0068] At step 460 (Texture region formation), background lung tissue and lung matrix
tissue are modeled. Background lung tissue is obtained by grouping together regions
output by the signal operators. Regions are formed by extrapolating edge segments
perpendicular to the edge gradient. Edges ending near vertices associated with other
edges are connected to form regions. In this implementation, the distance threshold
for connecting edge segments is 4 pixels.

[0069] Turning now to the decision process, it is determined which of the candidate nodules
found in modeling step 2 are true lung cancer nodules. At this point, there are
essentially two competing segmentations of the CT image: a region segmentation
and a ribbon segmentation. Each region is a candidate nodule, and it must be decided
whether the region, with an appropriate model on pixel intensities and region shape,
is a better explanation of its interior pixels than any possible vessel or background
explanation. To accomplish this, the two models are compared at step 500 using the
Bayes Factor. The competition framework is a pairwise comparison of the modeled
information: nodule vs. vessel, and nodule vs. background. If the nodule "wins" each
competition, then it is considered a suspicious region and is reported as such.

[0070] As used herein, "Bayes Factors" refer to a known decision mechanism that ensures
that the optimal decision is made given the input parameters. Applying the Bayes
Factor to embodiments of the present invention provides that optimal decisions will
be made given the statistical models of the shapes and signals provided by the
radiologists' expert observations. This optimality assumes that the statistical models of
each anatomy type represent all the relevant knowledge embodied in a trained
radiologist, and that the radiologist acts in a rational manner as defined by the Bayes
Factor mechanism. Thus, the hierarchy of information enables processing to make the
same decision as a radiologist would make regarding a region or nodule. Also, as
used herein, the term "Bayes Factors" is used interchangeably with the term "Bayesian
model competition".

[0071] To begin the competition, for each candidate nodule a patch of pixels around the
candidate is considered. In the nodule vs. vessel competition, this patch is defined as
the union of the candidate nodule and each conflicting ribbon, in turn. In the nodule
vs. background competition, the patch is defined as all pixels below a pre-specified
intensity threshold within a pre-specified radius of the geometric center of the
candidate, unioned with the candidate nodule. The radius is desirably set to 10 pixels,
and the intensity threshold is desirably set to 520 CT units. Once the competition
patch has been defined, each of the two competitions proceeds in the same fashion.
Table 1 gives all the information that is available during processing.
Table 1

Notation                               Description

x                                      intensity data
\theta_1                               geometrical measurement on model 1, e.g. "one nodule, background elsewhere"
\theta_2                               geometrical measurement on model 2, e.g. "one ribbon, background elsewhere"
p(x \mid M=1, \beta_1, \theta_1)       intensity model given that Model 1 is correct (\beta is a nuisance parameter)
p(x \mid M=2, \beta_2, \theta_2)       intensity model given that Model 2 is correct
p(\beta_1 \mid M=1)                    prior distribution on the intensity model nuisance parameter, given model 1 is correct
p(\beta_2 \mid M=2)                    prior distribution on the intensity model nuisance parameter, given model 2 is correct
p(\theta_1 \mid M=1)                   prior distribution on the geometrical parameters in model 1
p(\theta_2 \mid M=2)                   prior distribution on the geometrical parameters in model 2
p(M=1)                                 prior probabilities on model 1 (the "incidence" of model 1)
p(M=2)                                 prior probabilities on model 2


[0072] In each row of Table 1, the notation p(a|b) indicates the conditional probability
distribution on the random variable a given the value of the random variable b. 
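
The nodule vs. background patch of paragraph [0071] can be sketched with plain set operations. The defaults of 10 pixels and 520 CT units come from the text; the function name, the representation of the candidate as a set of (row, column) pixels and the use of a Euclidean radius are assumptions for illustration.

```python
def background_patch(image, candidate_pixels, radius=10.0, threshold=520.0):
    """Competition patch for the nodule vs. background comparison.

    image: 2D list of intensities.
    candidate_pixels: non-empty set of (row, col) pixels in the candidate nodule.
    Returns every pixel below threshold within radius of the candidate's
    geometric center, unioned with the candidate itself.
    """
    n = len(candidate_pixels)
    center_r = sum(r for r, _ in candidate_pixels) / n
    center_c = sum(c for _, c in candidate_pixels) / n
    patch = set(candidate_pixels)
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            inside = (r - center_r) ** 2 + (c - center_c) ** 2 <= radius ** 2
            if value < threshold and inside:
                patch.add((r, c))
    return patch
```

The nodule vs. vessel patch is simpler under the same representation: it is the union of the candidate's pixel set with the pixel set of one conflicting ribbon at a time.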

[0073] The intensity model for x_i, given that the candidate nodule is correct, is a normal
distribution with mean equal to a two-parameter paraboloid and constant variance, and is
defined as:

(1) 

[0074] where (u_i, v_i) is the two-dimensional location of the ith pixel, after a rotation
and translation that forces the least squares estimates of all other parameters in the
full 6-parameter paraboloid in two dimensions to be zero. The error term e is
normally distributed with zero mean and fixed variance, estimated off-line from true
nodule data. As used herein, "off-line" estimation refers to information learned
or known beforehand, such as chances or likelihood information.

[0075] The intensity model, given that the vessel is correct, is a two-parameter paraboloid
defined as:

(2) 

[0076] where in this case u_i is defined as a unit vector in the direction normal to the
fold-edge chain direction. The fold edge center of the ribbon is defined as u = 0.
Spans are defined at one-pixel separation along the chain, and each span's intensity 
data is modeled independently according to the above model. The error term is again 
normally distributed. 
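
As an illustration only, the two intensity models described in paragraphs [0073] to [0076] could take the following shape. These forms are assumptions inferred from the prose and are not the equations of record: the coefficient names \beta_0, \beta_1, \beta_2 and the variance symbol \sigma^2 are hypothetical, and the constant term \beta_0 in the nodule model is a guess, since the text counts that model as "two-parameter", which may refer to the two curvature terms alone.

```latex
% Assumed nodule model: paraboloid in the rotated, translated pixel coordinates (u_i, v_i)
x_i = \beta_0 + \beta_1 u_i^2 + \beta_2 v_i^2 + e_i, \qquad e_i \sim N(0, \sigma^2)

% Assumed vessel model for one span: parabola across the ribbon, centered at u = 0
x_i = \beta_0 + \beta_1 u_i^2 + e_i, \qquad e_i \sim N(0, \sigma^2)
```

Under these assumed forms a nodule curves away from its center in both directions, while a vessel span curves only across its width, which is what lets the two models compete for the same pixels.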

[0077] The background model is defined as independent normal data at each pixel with 
unknown mean and fixed variance, estimated off-line from true background data. 
This data is gathered by an expert and the variance is estimated using the usual 
normal-model unbiased estimate. 
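
The "usual normal-model unbiased estimate" is the sample variance with divisor n - 1. A minimal sketch, with a hypothetical function name:

```python
def unbiased_variance(samples):
    """Sample variance with divisor n - 1, unbiased for independent normal data."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are needed")
    mean = sum(samples) / n
    return sum((s - mean) ** 2 for s in samples) / (n - 1)
```

The standard library function `statistics.variance` computes the same quantity.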

[0078] Prior distributions are defined on all intensity model parameters as normal
distributions with means and covariance matrices estimated off-line from manually
segmented intensity data. Prior distributions on shape parameters are defined as
uniform distributions on key shape characteristics like nodule aspect ratio and size.
Prior probabilities on each model are determined via known scanner parameters,
namely Receiver Operating Characteristic curves, according to pre-specified
sensitivity and specificity targets.



[0079] To decide the winner of each competition, the Bayes Factor is calculated. The
Bayes Factor as used herein refers to the ratio of posterior model probabilities given
the intensity and shape data calculated in the last level (Step 3) of the modeling hierarchy
for two given models M=1 and M=2,

\frac{p(M=1 \mid x, \theta_1, \theta_2)}{p(M=2 \mid x, \theta_1, \theta_2)}

(3)

This ratio can be written as

\frac{p(M=1 \mid x, \theta_1, \theta_2)}{p(M=2 \mid x, \theta_1, \theta_2)} = \frac{p(x \mid \theta_1, M=1)\, p(\theta_1 \mid M=1)\, p(\theta_2 \mid \theta_1, M=1)\, p(M=1)}{p(x \mid \theta_2, M=2)\, p(\theta_2 \mid M=2)\, p(\theta_1 \mid \theta_2, M=2)\, p(M=2)}

(4)

where the factor

p(\theta_1 \mid \theta_2, M=2)

(5)

[0080] is assumed equal to one. The Bayes factor is necessarily greater than zero, and it 
indicates evidence for model 1 if the factor is greater than one (and vice versa). 

[0081] Candidate nodules which give Bayes factors greater than one in both competitions
are deemed suspicious, and are superimposed on the CT data in a visualization tool
(presenting step 280 of Figure 2). The characteristics of these suspicious nodules are 
also stored for further follow-up. 
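
The decision rule can be sketched in the log domain, where products of small probabilities become sums and "greater than one" becomes "greater than zero". This sketch assumes that each term of the ratio has already been evaluated for a given patch, and it treats both cross terms between the two models' geometrical parameters as equal to one; the text states this for factor (5), and extending it to the corresponding numerator term is an assumption. All function and argument names are hypothetical.

```python
import math


def log_bayes_factor(log_lik_1, log_prior_theta_1, log_prior_m1,
                     log_lik_2, log_prior_theta_2, log_prior_m2):
    """Log of the model 1 vs. model 2 ratio, with cross terms taken as one.

    log_lik_k:         log p(x | theta_k, M=k)
    log_prior_theta_k: log p(theta_k | M=k)
    log_prior_mk:      log p(M=k)
    """
    numerator = log_lik_1 + log_prior_theta_1 + log_prior_m1
    denominator = log_lik_2 + log_prior_theta_2 + log_prior_m2
    return numerator - denominator


def nodule_is_suspicious(log_bf_vs_vessels, log_bf_vs_background):
    """True if the nodule model wins every pairwise competition.

    log_bf_vs_vessels: one log Bayes factor per conflicting ribbon.
    A Bayes factor above one corresponds to a log Bayes factor above zero.
    """
    return (all(v > 0.0 for v in log_bf_vs_vessels)
            and log_bf_vs_background > 0.0)
```

A candidate with no conflicting ribbons passes the vessel competition trivially under this sketch, so its status rests on the background comparison alone.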

[0082] It is to be appreciated that incorporating knowledge about the imaging process
and the imaging device into the analysis techniques increases the accuracy and
robustness of image measurements. The competition framework provides a robust 
method for making a model selection decision. Modeling the anatomy in the images 
improves the robustness of the image measurements and allows results to be 
presented to doctors in the context of the anatomy. The anatomical models are easily 
explained to physicians and their expert knowledge is coherently incorporated into 
the system (in the form of mathematical approximations of anatomical features). The 
lowest level of the modeling hierarchy relies on time-tested image formation and 
understanding techniques which are firmly grounded in human visual perception. 
Anatomical models are chosen via Bayes Factors, enabling optimal decisions given the
statistical models. No training data (such as the voluminous training data needed for
neural networks) is required, but it can be used to supplement expert knowledge in
the specification of prior distributions. The model-based approach allows
incorporation of expert information elicited from radiologists. The results are reported 
to a radiologist or doctor incorporating the anatomical context of each suspicious 
region and the decision process by which suspicious regions were identified. 

[0083] In a further embodiment, the results are reported to a radiologist or doctor
(hereinafter "user") in such a manner that the user receives anatomical context, reasons for
the decision of whether the region is of a particular type (nodule or vessel), and
information of importance to the user. Information of radiological importance includes, for
example, size of nodule, number of vessels, evidence of spiculation,
chances/likelihood of cancer or disease, brightness measurements and other
characteristic information related to the disease at issue. Processor 120 of Figure 1 is
adapted to perform the computations needed to support this reporting functionality.
In a further embodiment, processor 120 is adapted to allow for user queries regarding
particular regions of interest such as pointing to a region and receiving information
such as size, number of vessels and brightness for the selected region.
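
One way to support such a point-and-query interaction is to keep a small record per reported region and look it up by pixel. The record layout and every name below are hypothetical; the fields simply mirror the items the text lists (size, number of vessels, brightness).

```python
from dataclasses import dataclass


@dataclass
class RegionReport:
    """Hypothetical per-region record holding the reported characteristics."""
    label: str               # for example "nodule" or "vessel"
    pixels: frozenset        # (row, col) pixels covered by the region
    size_pixels: int
    vessel_count: int
    mean_brightness: float


def query_region(reports, point):
    """Return the report whose region contains the selected point, or None."""
    for report in reports:
        if point in report.pixels:
            return report
    return None
```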

[0084] The embodiments above relate to lung cancer detection and specifically to the
distinction between lung nodules and vessels. In further embodiments, additional lung
disease characteristics are similarly modeled such as the low-density, sponge-like 
texture which is generally characteristic of emphysema. As was described with 
reference to nodules and vessels, anatomical feature descriptions are obtained by 
experts (e.g. radiologists) and mathematically represented as a hierarchy. In further 
embodiments, models are derived for diseases that occur in other areas, such as the 
brain, colon and heart. 

[0085] Also, in a further embodiment, the hierarchy of models may be used in known 
neural network techniques as the training data to identify low and intermediate 
information and prior distributions. It is desirable that the Bayes Factor analysis be 
applied at the higher levels to provide useful and interpretative diagnosis data and the 
decision process. 

[0086] In a further embodiment, processor 120 is further adapted to store the anatomical
context and processed image data sets to be searched and retrieved remotely. In this
embodiment, the information developed at each level in the model hierarchy is stored
in systems used for medical archives, medical search and retrieval systems and
alternate medical disease reporting systems. Further in this embodiment, information
that may be searched and retrieved includes: pathological and anatomical models
derived for characteristics of diseases, images representative of the diseases, and 
results of the model hierarchy computations (processed image data sets). The 
capability of storing/retrieving information for a particular diseased tissue type 
enables broader access to the information, such as via the Internet, a hospital 
information system, a radiological information system, or other information 
transmission infrastructure. Additionally, this information allows matching and 
retrieval of exams classified as similar based on the information provided by model 
hierarchy computations. 

[0087] In further embodiments, processor 120 is adapted to automatically send detailed
exam information to remote workstations or portable computing devices via an
information transmission infrastructure. In a further embodiment, processor 120 is
adapted to automatically send detailed exam information which
meets selected specified requirements determined in advance of transmission or
determined adaptively by the processing system. In order to further tune or adjust
analysis programs, processor 120 is also adapted to tune at least one computer
analysis algorithm based on information from model hierarchy computations stored in
previous exams.

[0088] Also, in another further embodiment, processor 120 is further adapted to
generate statistical measurements based on the information from model hierarchy
computations stored in previous exams and report results of the statistical
measurements to a local or remote monitoring facility. In this embodiment, processor
120 may also be configured to report the results of the statistical measurements if
predetermined criteria based on the system performance are met.

[0089] In an exemplary embodiment, the steps outlined in the last section are
implemented in C++ code based on the TargetJr image understanding library
(http://www.targetjr.org). A set of DICOM (Digital Imaging and Communications in
Medicine) image files, one for each slice in the CT scan, is input into the program
and the program returns suspicious nodules to be visualized on the original CT data
or saved for further follow-up. It is to be appreciated that the use of other coding software
known to one skilled in the art would not depart from the spirit of the invention.

[0090] The embodiments of the invention presented in previous paragraphs focus on the 
problem of locating suspicious regions in CT lung scans. It is to be appreciated that 
the hierarchical image modeling framework can be directly transferred to other 
imaging modalities (for example MRI, X-ray, ultrasound scanner, positron emission 
tomography (PET) scanner) and diseases by re-specifying the low-level detection 
techniques and the statistical distributions of anatomy. 

[0091] While the preferred embodiments of the present invention have been shown and 
described herein, it will be obvious that such embodiments are provided by way of
example only. Numerous variations, changes and substitutions will occur to those of 
skill in the art without departing from the invention herein. Accordingly, it is intended 
that the invention be limited only by the spirit and scope of the appended claims. 